Lazy Learner on Decision Tree for Ranking

Authors

  • Yuhong Yan
  • Han Liang
Abstract

This paper aims to improve probability-based ranking (e.g., AUC) under the decision-tree paradigm. We observe that probability-based ranking amounts to sorting samples by their class probabilities, so ranking is a relative evaluation metric among those samples. This motivates us to use a lazy learner that explicitly yields a set of class probabilities unique to each test sample, based on its similarities to the training samples within its neighborhood. We embed such lazy learners at the leaves of a decision tree to provide class probability assignments. This results in our first model, named the Lazy Distance-based Tree (LDTree). We then improve this model by growing the tree a second time, and call the resulting model the Eager Distance-based Tree (EDTree). In addition to the benefits of lazy learning, EDTree also takes advantage of the finer resolution of a larger tree structure. We compare our models with C4.5, C4.4, and their variants in terms of AUC on a large suite of UCI data sets. The observed improvements show that our method follows a new path toward better ranking performance.
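The abstract describes the core idea of LDTree: each leaf keeps the training samples that reach it and, at prediction time, a lazy, distance-based estimator turns a test sample's similarity to those samples into class probabilities. The paper's exact weighting scheme is not reproduced here, so the following is only an illustrative sketch in Python, assuming inverse-distance weighting over the k nearest leaf samples with Laplace smoothing; the function name and parameters are hypothetical, not the authors' implementation.

    import numpy as np

    def leaf_class_probabilities(leaf_X, leaf_y, x_test, n_classes, k=5, eps=1e-9):
        # Distances from the test sample to every training sample in this leaf.
        dists = np.linalg.norm(leaf_X - x_test, axis=1)
        # Restrict to the k nearest leaf samples (the "neighborhood").
        nearest = np.argsort(dists)[:min(k, len(dists))]
        # Closer samples receive larger weights; eps avoids division by zero.
        weights = 1.0 / (dists[nearest] + eps)
        # Laplace smoothing keeps every class probability non-zero.
        probs = np.ones(n_classes)
        for cls, w in zip(leaf_y[nearest], weights):
            probs[cls] += w
        return probs / probs.sum()

    # Toy usage: two leaf samples of class 0, one of class 1.
    leaf_X = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8]])
    leaf_y = np.array([0, 0, 1])
    print(leaf_class_probabilities(leaf_X, leaf_y, np.array([0.15, 0.15]), n_classes=2))

Under this assumed scheme, two test samples routed to the same leaf can receive different class probabilities, which is what allows the tree to rank samples more finely than a frequency-based leaf estimate.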

Similar resources

Lazy Learning for Improving Ranking of Decision Trees

Decision tree-based probability estimation has received great attention because accurate probability estimation can possibly improve classification accuracy and probability-based ranking. In this paper, we aim to improve probability-based ranking under decision tree paradigms using AUC as the evaluation metric. We deploy a lazy probability estimator at each leaf to avoid uniform probability ass...

Boosting Lazy Decision Trees

This paper explores the problem of how to construct lazy decision tree ensembles. We present and empirically evaluate a relevance-based, boosting-style algorithm that builds a lazy decision tree ensemble customized for each test instance. From the experimental results, we conclude that our boosting-style algorithm significantly improves the performance of the base learner. An empirical comparison...

Linguistic Knowledge Based Supervised Key-phrase Extraction

The most important information about the content of a document is represented by the key phrases of that document. In this study, an automatic key-phrase extraction algorithm is devised using machine learning techniques. The proposed method considers not only document-level statistics such as TFxIDF but also the linguistic features of the phrases. Experiment has been performed on N...

A Multi-level Classification Model Pertaining to the Student’s Academic Performance Prediction

Monitoring and evaluating students' performance is an essential activity of an education system, keeping track of students' success and failure records. The objective of this research is to provide the best classification model to predict students' academic performance. In this paper we propose a Multilevel Classification Model (MLCM) based on Decision Tree Algorithm for the pr...

Batched Lazy Decision Trees

We introduce a batched lazy algorithm for supervised classification using decision trees. It avoids unnecessary visits to irrelevant nodes when it is used to make predictions with either eagerly or lazily trained decision trees. A set of experiments demonstrates that the proposed algorithm can outperform both the conventional and lazy decision tree algorithms in terms of computation time as well...

Journal:
  • International Journal on Artificial Intelligence Tools

Volume 17, Issue -

Pages -

Publication year: 2008